# Multi-task distillation

## DeepSeek R1 Distill Qwen 32B Unsloth Bnb 4bit

License: Apache-2.0
DeepSeek-R1 is the DeepSeek team's first-generation reasoning model. Trained with large-scale reinforcement learning, it achieves strong reasoning performance without requiring supervised fine-tuning (SFT) as an initial step.
Tags: Large Language Model · Transformers · English
unsloth · 938 · 10
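
A minimal loading sketch for this checkpoint, assuming the Hugging Face repo id `unsloth/DeepSeek-R1-Distill-Qwen-32B-unsloth-bnb-4bit` implied by the listing title (not verified here); the pre-quantized bnb-4bit weights require the `bitsandbytes` package and a CUDA GPU.

```python
# Sketch only: repo id assumed from the listing title above.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "unsloth/DeepSeek-R1-Distill-Qwen-32B-unsloth-bnb-4bit"  # assumed

tokenizer = AutoTokenizer.from_pretrained(model_id)
# device_map="auto" places the 4-bit weights across available GPUs.
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Solve step by step: what is 17 * 23?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```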
## XtremeDistil L12 H384 Uncased

License: MIT
XtremeDistilTransformers is a task-agnostic Transformer model distilled via task transfer learning, yielding a small universal model that can be applied to any task and language.
Tags: Large Language Model · Transformers · English
microsoft · 471 · 15
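
Because the distilled encoder is task-agnostic, it can serve as a drop-in backbone for downstream tasks. A minimal sketch, assuming the published checkpoint `microsoft/xtremedistil-l12-h384-uncased`; the 2-label classification head is freshly initialized and purely illustrative, so fine-tune before real use.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "microsoft/xtremedistil-l12-h384-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# num_labels=2 attaches an untrained classification head (illustrative only).
model = AutoModelForSequenceClassification.from_pretrained(model_id, num_labels=2)

inputs = tokenizer("A compact encoder for any task or language.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.softmax(dim=-1))
```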